1 Introduction

TCGAbiolinksGUI was created to help users who are more comfortable with graphical user interfaces (GUIs) search, download and analyze cancer data. It offers a graphical user interface to the R/Bioconductor package TCGAbiolinks (A. Colaprico et al. 2016), which accesses the National Cancer Institute (NCI) Genomic Data Commons (GDC) through its Application Programming Interface (API). Additional Bioconductor packages are included: ComplexHeatmap (Gu, Eils, and Schlesner 2016) to aid in visualizing the data, ELMER (Yao et al. 2015) to identify regulatory enhancers using gene expression, DNA methylation data and motif analysis, and Pathview (Luo and Brouwer 2013) for pathway-based data integration and visualization.

The GUI was created using Shiny, a web application framework for R, and uses several packages that extend Shiny apps: shinyjs to add JavaScript actions to the app, shinydashboard to add dashboards and shinyFiles to provide an API for client-side access to the server file system. A running version of the GUI is available on shinyapps.io: https://tcgabiolinksgui.shinyapps.io/demo/

2 Starting with TCGAbiolinksGUI

2.1 Installation

To install the package from the Bioconductor repository (http://bioconductor.org/packages/TCGAbiolinksGUI/), use the following code. To date, the package is only available in the devel version of Bioconductor (https://www.bioconductor.org/developers/how-to/useDevel/), but it should be available in the release version by May 2017.

source("https://bioconductor.org/biocLite.R")
biocLite("TCGAbiolinksGUI", dependencies = TRUE)

To install the development version of the package via GitHub:

source("https://bioconductor.org/biocLite.R")
# Bioconductor dependencies: install only those not yet present
deps <- c("pathview","clusterProfiler","ELMER", "DO.db","GO.db", "ComplexHeatmap","EDASeq", "TCGAbiolinks")
for(pkg in deps)  if (!pkg %in% rownames(installed.packages())) biocLite(pkg, dependencies = TRUE)
# CRAN dependencies: install only those not yet present
deps <- c("devtools","shape","shiny","readr","googleVis","shinydashboard","shinyFiles","shinyjs","shinyBS")
for(pkg in deps)  if (!pkg %in% rownames(installed.packages()))  install.packages(pkg,dependencies = TRUE)
# Development version of the GUI from GitHub
devtools::install_github("BioinformaticsFMRP/TCGAbiolinksGUI")

2.2 Docker image

TCGAbiolinksGUI is available as a Docker image, which can be run on Mac OS, Windows and Linux systems. The image can be obtained from Docker Hub: https://hub.docker.com/r/tiagochst/tcgabiolinksgui/

To download the image:

docker pull tiagochst/tcgabiolinksgui

To run R from the command line:

docker run -ti tiagochst/tcgabiolinksgui R

To run RStudio Server (user: rstudio, password: rstudio):

docker run -p 8787:8787 tiagochst/tcgabiolinksgui

For more information please check: https://docs.docker.com/

2.3 Quick start

The following commands should be used to start the graphical user interface.

library(TCGAbiolinksGUI)
TCGAbiolinksGUI()
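
Before launching, it can help to confirm that the packages named in the installation section are present. The following is a minimal sketch, not part of the package: `missing_packages()` is a hypothetical helper, and the package list is a subset copied from the installation commands. The GUI is started only in an interactive session and only when nothing is missing.

```r
# Hypothetical helper (not part of TCGAbiolinksGUI): return the names in
# `pkgs` that are not installed in any library known to this R session.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Package names taken from the installation section of this document.
deps <- c("TCGAbiolinks", "ComplexHeatmap", "ELMER", "pathview",
          "shiny", "shinydashboard", "shinyFiles", "shinyjs", "shinyBS",
          "TCGAbiolinksGUI")

not_installed <- missing_packages(deps)

if (length(not_installed) > 0) {
  # Report what still needs to be installed instead of failing at launch.
  message("Missing packages: ", paste(not_installed, collapse = ", "))
} else if (interactive()) {
  # Everything is available: start the graphical user interface.
  TCGAbiolinksGUI::TCGAbiolinksGUI()
}
```

If the message lists any packages, the installation commands can be rerun for just those names.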

2.4 Video tutorials

To facilitate the use of this package, we have created tutorial videos demonstrating the tool. Some sections include a video tutorial link that redirects to the video on YouTube. For the complete list of videos, please check this YouTube list.

2.5 PDF tutorials

For each section we created PDFs detailing its steps: Link to folder with PDFs

2.6 Questions and issues

Please use GitHub issues if you want to file bug reports or feature requests.

2.7 Data input summary

| Menu | Sub-menu | Button | Data input |
| --- | --- | --- | --- |
| Clinical analysis | Profile Plot | Select file | A table with at least two categorical columns |
| Clinical analysis | Survival Plot | Select file | A table with at least the following columns: days_to_death, days_to_last_followup and one column with a group |
| Epigenetic analysis | Differential methylation analysis | Select data (.rda) | A SummarizedExperiment object |
| Epigenetic analysis | Volcano Plot | Select results | A csv file with the following pattern: DMR_results_GroupCol_group1_group2_pcut_1e-30_meancut_0.55.csv (where GroupCol, group1, group2 are the names of the columns selected in the DMR steps) |
| Epigenetic analysis | Heatmap plot | Select file | A SummarizedExperiment object |
| Epigenetic analysis | Heatmap plot | Select results | Same as Epigenetic analysis > Volcano Plot > Select results |
| Epigenetic analysis | Mean DNA methylation | Select file | A SummarizedExperiment object |
| Transcriptomic Analysis | Volcano Plot | Select results | A csv file with the following pattern: DEA_results_GroupCol_group1_group2_pcut_1e-30_meancut_0.55.csv (where GroupCol, group1, group2 are the names of the columns selected in the DEA steps) |
| Transcriptomic Analysis | Heatmap plot | Select file | A SummarizedExperiment object |
| Transcriptomic Analysis | OncoPrint plot | Select MAF file | A MAF file (columns needed: Hugo_Symbol, Tumor_Sample_Barcode, Variant_Type) |
| Transcriptomic Analysis | OncoPrint plot | Select Annotation file | A file with at least the following columns: bcr_patient_barcode |
| Integrative analysis | Starburst plot | DMR result | A csv file with the following pattern: DMR_results_GroupCol_group1_group2_pcut_1e-30_meancut_0.55.csv (where GroupCol, group1, group2 are the names of the columns selected in the DMR steps) |
| Integrative analysis | Starburst plot | DEA result | A csv file with the following pattern: DEA_results_GroupCol_group1_group2_pcut_1e-30_meancut_0.55.csv (where GroupCol, group1, group2 are the names of the columns selected in the DEA steps) |
| Integrative analysis | ELMER | Create mee > Select DNA methylation object | An rda file with a SummarizedExperiment object |
| Integrative analysis | ELMER | Select results > Select expression object | An rda file with the RNAseq data frame |
| Integrative analysis | ELMER | Select mee | An rda file with a mee object |
| Integrative analysis | ELMER | Select results | An rda file with the results of the ELMER analysis |
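
The column requirements listed above can be checked before a file is selected in the GUI. Below is a minimal sketch in base R. `check_columns()` and `result_file_name()` are hypothetical helpers, not functions of TCGAbiolinksGUI, and the example data are invented for illustration. Writing the clinical table as csv is an assumption about the accepted file format. The file name helper only reproduces the naming pattern given in the summary.

```r
# Hypothetical helper: return the required columns absent from a data frame.
check_columns <- function(data, required) {
  setdiff(required, colnames(data))
}

# Columns listed for the Survival Plot input; "group" stands in for the
# grouping column, whose actual name is the user's choice.
survival_required <- c("days_to_death", "days_to_last_followup", "group")

# Columns listed for the OncoPrint MAF input.
maf_required <- c("Hugo_Symbol", "Tumor_Sample_Barcode", "Variant_Type")

# Invented example data, for illustration only.
clinical <- data.frame(
  bcr_patient_barcode   = c("P1", "P2", "P3"),
  days_to_death         = c(350, NA, 1200),
  days_to_last_followup = c(NA, 800, NA),
  group                 = c("A", "B", "A"),
  stringsAsFactors      = FALSE
)

missing_cols <- check_columns(clinical, survival_required)
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Save the table so it can be chosen with "Select file"
# (csv is an assumption about the accepted format).
out_file <- file.path(tempdir(), "clinical_example.csv")
write.csv(clinical, out_file, row.names = FALSE)

# An incomplete MAF-like table: the check reports the column still needed.
maf <- data.frame(Hugo_Symbol = "TP53", Tumor_Sample_Barcode = "P1")
check_columns(maf, maf_required)

# Hypothetical helper: build a results file name following the pattern
# <prefix>_results_GroupCol_group1_group2_pcut_<p>_meancut_<m>.csv
result_file_name <- function(prefix, group_col, group1, group2, pcut, meancut) {
  paste0(prefix, "_results_", group_col, "_", group1, "_", group2,
         "_pcut_", pcut, "_meancut_", meancut, ".csv")
}

result_file_name("DMR", "GroupCol", "group1", "group2", 1e-30, 0.55)
```

Keeping the check separate from the file writing makes it easy to reuse for the annotation file, which needs at least the bcr_patient_barcode column.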

3 Citation

Please cite both the TCGAbiolinks package and TCGAbiolinksGUI:

  • Silva TC, Colaprico A, Olsen C, Bontempi G, Ceccarelli M, Berman BP, and Noushmehr H. “TCGAbiolinksGUI: A Graphical User Interface to analyze cancer molecular and clinical data.” Bioinformatics - Submitted for review.
  • Colaprico A, Silva TC, Olsen C, Garofano L, Cava C, Garolini D, Sabedot T, Malta TM, Pagnotta SM, Castiglioni I, Ceccarelli M, Bontempi G and Noushmehr H. “TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data.” Nucleic acids research (2015): gkv1507.

Other publications related to this package:

  • “TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages”. F1000Research 10.12688/f1000research.8923.1 (T. Silva et al. 2016)

If you used ELMER, please cite:

  • Yao, L., Shen, H., Laird, P. W., Farnham, P. J., & Berman, B. P. “Inferring regulatory element landscapes and transcription factor networks from cancer methylomes.” Genome Biol 16 (2015): 105.
  • Yao, Lijing, Benjamin P. Berman, and Peggy J. Farnham. “Demystifying the secret mission of enhancers: linking distal regulatory elements to target genes.” Critical reviews in biochemistry and molecular biology 50.6 (2015): 550-573.

If you used the OncoPrint plot or the Heatmap plot, please cite:

  • Gu, Zuguang, Roland Eils, and Matthias Schlesner. “Complex heatmaps reveal patterns and correlations in multidimensional genomic data.” Bioinformatics (2016): btw313

If you used the Pathway plot, please cite:

  • Luo, Weijun and Brouwer, Cory (2013). “Pathview: an R/Bioconductor package for pathway-based data integration and visualization.” Bioinformatics, 29(14), pp. 1830-1831.

4 Session info

sessionInfo()
## R Under development (unstable) (2017-01-23 r72020)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.2 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] TCGAbiolinksGUI_0.99.18    shinydashboard_0.5.3      
##  [3] DT_0.2                     dplyr_0.5.0               
##  [5] SummarizedExperiment_1.4.0 Biobase_2.34.0            
##  [7] GenomicRanges_1.26.3       GenomeInfoDb_1.10.3       
##  [9] IRanges_2.8.1              S4Vectors_0.12.1          
## [11] BiocGenerics_0.20.0        TCGAbiolinks_2.3.18       
## [13] RTCGAToolbox_2.4.0        
## 
## loaded via a namespace (and not attached):
##   [1] rtracklayer_1.34.2          ggthemes_3.4.0             
##   [3] prabclus_2.2-6              minet_3.32.0               
##   [5] R.methodsS3_1.7.1           pkgmaker_0.22              
##   [7] tidyr_0.6.1                 bumphunter_1.14.0          
##   [9] minfi_1.20.2                ggplot2_2.2.1              
##  [11] knitr_1.15.13               aroma.light_3.4.0          
##  [13] R.utils_2.5.0               data.table_1.10.4          
##  [15] hwriter_1.3.2               KEGGREST_1.14.0            
##  [17] RCurl_1.95-4.8              GEOquery_2.40.0            
##  [19] doParallel_1.0.10           GenomicFeatures_1.26.3     
##  [21] preprocessCore_1.36.0       RSQLite_1.1-2              
##  [23] BiocStyle_2.2.1             xml2_1.1.1                 
##  [25] httpuv_1.3.3                assertthat_0.1             
##  [27] viridis_0.3.4               evaluate_0.10              
##  [29] BiocInstaller_1.24.0        DEoptimR_1.0-8             
##  [31] caTools_1.17.1              dendextend_1.4.0           
##  [33] Rgraphviz_2.18.0            km.ci_0.5-2                
##  [35] igraph_1.0.1                DBI_0.5-1                  
##  [37] geneplotter_1.52.0          htmlwidgets_0.8            
##  [39] reshape_0.8.6               EDASeq_2.8.0               
##  [41] matlab_1.0.2                selectr_0.3-1              
##  [43] ggpubr_0.1.1.999            backports_1.0.5            
##  [45] trimcluster_0.1-2           annotate_1.52.1            
##  [47] biomaRt_2.30.0              pathview_1.14.0            
##  [49] withr_1.0.2                 robustbase_0.92-7          
##  [51] googleVis_0.6.2             GenomicAlignments_1.10.0   
##  [53] c3net_1.1.1                 mclust_5.2.2               
##  [55] mnormt_1.5-5                cluster_2.0.5              
##  [57] DOSE_3.0.10                 ape_4.1                    
##  [59] lazyeval_0.2.0              genefilter_1.56.0          
##  [61] edgeR_3.16.5                nlme_3.1-131               
##  [63] nnet_7.3-12                 devtools_1.12.0            
##  [65] RJSONIO_1.3-0               diptest_0.75-7             
##  [67] miniUI_0.1.1                colourpicker_0.3           
##  [69] downloader_0.4              registry_0.3               
##  [71] affyio_1.44.0               rprojroot_1.2              
##  [73] matrixStats_0.51.0          shinyFiles_0.6.2           
##  [75] graph_1.52.0                rngtools_1.2.4             
##  [77] base64_2.0                  Matrix_1.2-8               
##  [79] KMsurv_0.1-5                zoo_1.7-14                 
##  [81] whisker_0.3-2               GlobalOptions_0.0.10       
##  [83] png_0.1-7                   rjson_0.2.15               
##  [85] bitops_1.0-6                R.oo_1.21.0                
##  [87] ConsensusClusterPlus_1.38.0 KernSmooth_2.23-15         
##  [89] Biostrings_2.42.1           doRNG_1.6                  
##  [91] shape_1.4.2                 stringr_1.2.0              
##  [93] qvalue_2.6.0                nor1mix_1.2-2              
##  [95] ShortRead_1.32.0            dnet_1.0.11                
##  [97] readr_1.0.0                 scales_0.4.1               
##  [99] memoise_1.0.0               magrittr_1.5               
## [101] plyr_1.8.4                  hexbin_1.27.1              
## [103] gplots_3.0.1                gdata_2.17.0               
## [105] zlibbioc_1.20.0             compiler_3.4.0             
## [107] RColorBrewer_1.1-2          illuminaio_0.16.0          
## [109] KEGGgraph_1.32.0            Rsamtools_1.26.1           
## [111] affy_1.52.0                 XVector_0.14.0             
## [113] MASS_7.3-45                 stringi_1.1.2              
## [115] shinyBS_0.62                yaml_2.1.14                
## [117] GOSemSim_2.0.4              locfit_1.5-9.1             
## [119] supraHex_1.12.0             latticeExtra_0.6-28        
## [121] ggrepel_0.6.5               survMisc_0.5.4             
## [123] grid_3.4.0                  fastmatch_1.1-0            
## [125] tools_3.4.0                 circlize_0.3.9             
## [127] foreach_1.4.3               foreign_0.8-67             
## [129] git2r_0.18.0                gridExtra_2.2.1            
## [131] digest_0.6.12               shiny_1.0.0                
## [133] quadprog_1.5-5              fpc_2.1-10                 
## [135] Rcpp_0.12.9.2               siggenes_1.48.0            
## [137] broom_0.4.2                 OrganismDbi_1.16.0         
## [139] httr_1.2.1                  survminer_0.3.0            
## [141] AnnotationDbi_1.36.2        RCircos_1.2.0              
## [143] ComplexHeatmap_1.12.0       psych_1.6.12               
## [145] kernlab_0.9-25              colorspace_1.3-2           
## [147] rvest_0.3.2                 XML_3.98-1.5               
## [149] splines_3.4.0               RBGL_1.50.0                
## [151] multtest_2.30.0             flexmix_2.3-13             
## [153] xtable_1.8-2                jsonlite_1.2               
## [155] ELMER_1.4.0                 modeltools_0.2-21          
## [157] R6_2.2.0                    htmltools_0.3.5            
## [159] mime_0.5                    clusterProfiler_3.2.11     
## [161] BiocParallel_1.8.1          DESeq_1.26.0               
## [163] class_7.3-14                beanplot_1.2               
## [165] codetools_0.2-15            fgsea_1.1.2                
## [167] mvtnorm_1.0-5               lattice_0.20-34            
## [169] tibble_1.2                  curl_2.3                   
## [171] gtools_3.5.0                shinyjs_0.9.0.9000         
## [173] GO.db_3.4.0                 openssl_0.9.6              
## [175] survival_2.40-1             limma_3.30.11              
## [177] rmarkdown_1.3               munsell_0.4.3              
## [179] parmigene_1.0.2             DO.db_2.9                  
## [181] GetoptLong_0.1.5            iterators_1.0.8            
## [183] reshape2_1.4.2              gtable_0.2.0

References

Gu, Zuguang, Roland Eils, and Matthias Schlesner. 2016. “Complex Heatmaps Reveal Patterns and Correlations in Multidimensional Genomic Data.” Bioinformatics. doi:10.1093/bioinformatics/btw313.

Luo, Weijun, and Cory Brouwer. 2013. “Pathview: An R/Bioconductor Package for Pathway-Based Data Integration and Visualization.” Bioinformatics 29 (14). Oxford Univ Press: 1830–1.

Silva, TC, A Colaprico, C Olsen, F D’Angelo, G Bontempi, M Ceccarelli, and H Noushmehr. 2016. “TCGA Workflow: Analyze Cancer Genomics and Epigenomics Data Using Bioconductor Packages [Version 2; Referees: 1 Approved, 1 Approved with Reservations].” F1000Research 5 (1542). doi:10.12688/f1000research.8923.2.

Yao, L, H Shen, PW Laird, PJ Farnham, and BP Berman. 2015. “Inferring Regulatory Element Landscapes and Transcription Factor Networks from Cancer Methylomes.” Genome Biology 16 (1): 105.